12 research outputs found

    A general technique for deterministic model-cycle-level debugging

    Get PDF
    Efficient use of FPGA resources requires FPGA-based performance models of complex hardware to implement one model cycle, i.e., one time-step of the original synchronous system, in several implementation cycles. In general, implementation cycles have no simple relationship with model cycles, and it is tricky to reconstruct the state of the synchronous system at the model-cycle boundaries if only implementation-cycle-level control and information are provided. A good debugging facility needs to provide: complete control over the functioning of the target design being simulated; fast and easy access to all the significant target design state for both monitoring and modification; and some means of accomplishing deterministic execution when the target design is a multicore processor running a parallel application. Moreover, these features need to be provided in a manner that does not incur substantial resource and performance penalties. In this paper, we present a debugging technique based on the LI-BDN theory. We show how the technique facilitates deterministic model-cycle-level debugging. We used it to build the debugging infrastructure for Arete, an FPGA-based cycle-accurate multicore simulator. The resource and performance penalties of our debugging technique are minimal; in Arete the debugging infrastructure has area and performance overheads of 5% and 6%, respectively.
    IBM Research
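
    To illustrate the model-cycle abstraction the paper relies on, here is a minimal Python sketch (hypothetical names, not Arete's actual interfaces) of a stepper that advances the target by whole model cycles, hiding the variable number of implementation cycles each one takes; target state is inspected or modified only at model-cycle boundaries.

```python
# A minimal sketch under assumed interfaces; not Arete's actual code.

class ModelCycleStepper:
    """Advance a simulated target by whole model cycles, hiding the
    variable number of implementation cycles each model cycle takes."""

    def __init__(self, dut):
        self.dut = dut
        self.impl_cycles = 0

    def step_model_cycle(self):
        # Clock the FPGA-side model until it signals a model-cycle boundary
        # (in an LI-BDN, when every output of the network has fired).
        self.dut.clock()
        self.impl_cycles += 1
        while not self.dut.model_cycle_done():
            self.dut.clock()
            self.impl_cycles += 1

    def snapshot(self):
        # Target state is only meaningful (and safe to read or modify)
        # at a model-cycle boundary.
        return dict(self.dut.architectural_state())


class ToyDut:
    """Toy target where one model cycle takes 3 implementation cycles."""
    def __init__(self):
        self._impl = 0
        self._counter = 0
    def clock(self):
        self._impl += 1
        if self._impl % 3 == 0:
            self._counter += 1      # model-visible state updates once per model cycle
    def model_cycle_done(self):
        return self._impl % 3 == 0
    def architectural_state(self):
        return {"counter": self._counter}


stepper = ModelCycleStepper(ToyDut())
for _ in range(2):
    stepper.step_model_cycle()
print(stepper.snapshot(), stepper.impl_cycles)   # {'counter': 2} 6
```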

    WiLIS: Architectural Modeling of Wireless Systems

    Get PDF
    The performance of a wireless system depends on the wireless channel as well as the algorithms used in the transceiver pipelines. Because physical phenomena affect transceiver pipelines in difficult-to-predict ways, detailed simulation of the entire transceiver system is needed to evaluate even a single processing block. Further, some protocol validations require simulation of rare events (say, 1 bit error in 10^9 bits), which means the protocol must be simulated long enough for such events to materialize. This requirement, coupled with the heavy computation typical of most physical-layer processing, rules out pure software solutions. In this paper we describe WiLIS, an FPGA-based hybrid hardware-software system designed to facilitate the development of wireless protocols. We then use WiLIS to evaluate several microarchitectures for measuring very low bit-error rates (BER). We demonstrate, for the first time, that the recently proposed SoftPHY can be implemented efficiently in hardware.
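
    A quick sketch of why measuring very low BER rules out pure software (not WiLIS itself, just a standard Monte Carlo BER estimate for BPSK over AWGN): a target of ~10^-9 needs on the order of 10^10 simulated bits for a statistically meaningful error count, which at typical software throughputs is hours per data point.

```python
# Monte Carlo BER estimation for BPSK over an AWGN channel.
import numpy as np

def ber_bpsk_awgn(ebn0_db, n_bits=1_000_000, seed=0):
    rng = np.random.default_rng(seed)
    bits = rng.integers(0, 2, n_bits)
    symbols = 1.0 - 2.0 * bits                 # map 0 -> +1, 1 -> -1
    ebn0 = 10 ** (ebn0_db / 10)
    noise_std = np.sqrt(1.0 / (2.0 * ebn0))    # unit symbol energy
    received = symbols + noise_std * rng.standard_normal(n_bits)
    decisions = (received < 0).astype(int)     # hard-decision demodulation
    return np.mean(decisions != bits)

# At a ~1e-9 target BER you need >= 1e10 bits for a few hundred error
# events; that simulation volume is the gap an FPGA platform closes.
print(ber_bpsk_awgn(6.0))   # roughly 2.4e-3 at Eb/N0 = 6 dB
```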

    Verification of microarchitectural refinements in rule-based systems

    Get PDF
    http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=5970511&tag=1
    Microarchitectural refinements are often required to meet performance, area, or timing constraints when designing complex digital systems. While refinements are often straightforward to implement, it is difficult to formally specify the conditions of correctness for those which change cycle-level timing. As a result, in the later stages of design, only those changes are considered that do not affect timing and whose verification can be automated using tools for checking FSM equivalence. This excludes an essential class of microarchitectural changes, such as the insertion of a register in a long combinational path to meet timing. A design methodology based on guarded atomic actions, or rules, offers an opportunity to raise the notion of correctness to a more abstract level. In rule-based systems, many useful refinements can be expressed simply by breaking a single rule into smaller rules which execute the original operation in multiple steps. Since the smaller rule executions can be interleaved with other rules, the verification task is to determine that no new behaviors have been introduced. We formalize this notion of correctness and present a tool based on SMT solvers that can automatically prove that a refinement is correct, or provide concrete information as to why it is not. With this tool, a larger class of refinements at all stages of the design process can be verified easily. We demonstrate the use of our tool in proving the correctness of the refinement of a processor pipeline from four stages to five.
    National Science Foundation (U.S.) (NSF #CCF-0541164)
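
    A toy sketch of the idea (not the paper's tool, and much weaker than its full correctness condition, which also covers interleavings with other rules): use an SMT solver to check the necessary condition that sequentially composing the two sub-rules reproduces the original atomic update in every state.

```python
# Checking a rule split with the z3 SMT solver.
from z3 import Int, Solver, Not, unsat

x = Int("x")

def original(x):          # atomic rule: x := x + 2
    return x + 2

def sub1(x):              # refined rule 1: x := x + 1
    return x + 1

def sub2(x):              # refined rule 2: x := x + 1
    return x + 1

s = Solver()
# Ask for a counterexample state where the composition differs.
s.add(Not(sub2(sub1(x)) == original(x)))
if s.check() == unsat:
    print("no counterexample: composition matches the atomic rule")
else:
    print("counterexample state:", s.model())
```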

    A comparative evaluation of high-level hardware synthesis using Reed-Solomon decoder

    No full text
    Using the example of a Reed-Solomon decoder, we provide insights into what types of hardware structures need to be generated to achieve specific performance targets. Due to the presence of run-time dependencies, it is sometimes not clear how the C code can be restructured so that a synthesis tool can infer the desired hardware structure. Such hardware structures are easy to express in an HDL. We present an implementation in Bluespec, a high-level HDL, and show a 7.8× improvement in performance while using only 0.45× the area of a C-based implementation.
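
    As a concrete example of the kind of structure involved (a Python sketch, not the paper's Bluespec code), Reed-Solomon syndrome computation evaluates one multiply-accumulate chain per syndrome; in hardware all 2t chains update in parallel, one received symbol per cycle, a shape that is natural to write in an HDL but hard for a C synthesis tool to infer from a nested loop.

```python
# Reed-Solomon syndrome computation over GF(2^8).

def gf_mul(a, b, poly=0x11D):
    """GF(2^8) multiply, primitive polynomial x^8 + x^4 + x^3 + x^2 + 1."""
    r = 0
    for _ in range(8):
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if a & 0x100:
            a ^= poly
    return r

def gf_pow(base, e):
    r = 1
    for _ in range(e):
        r = gf_mul(r, base)
    return r

def syndromes(received, two_t, alpha=2):
    """S_i = r(alpha^(i+1)) via Horner's rule: one MAC chain per syndrome.
    In hardware, all two_t chains advance in parallel each cycle."""
    out = []
    for i in range(two_t):
        a_i = gf_pow(alpha, i + 1)
        s = 0
        for sym in received:
            s = gf_mul(s, a_i) ^ sym
        out.append(s)
    return out

# A valid codeword has all-zero syndromes; any nonzero syndrome flags errors.
print(syndromes([0] * 255, two_t=32))
```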

    A design flow based on modular refinement

    No full text
    We propose a practical methodology based on modular refinement to design complex systems. The methodology relies on modules with latency-insensitive interfaces so that refinements can change the timing contract of a module without affecting the overall functional correctness of the system. Such refinements can exacerbate the unit-testing problem for modules whose specifications admit a set of output behaviors for the same input (non-determinism), or modules whose input behavior may be affected by past outputs (feedback). We avoid the difficult problem of generating appropriate unit tests for such modules by using system-level tests as unit tests to verify the correctness of refined modules. We illustrate our methodology by showing how one might develop a microprocessor with an in-order pipeline. We then develop a superscalar pipeline using the in-order pipeline as the starting point. Our methodology leverages the effort of design exploration to reduce the effort of specifying interface contracts and unit testing.
    National Science Foundation (U.S.) (grant CCF-0541164 on Complex Digital Systems)
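
    A minimal sketch of the key property (hypothetical module names, not the paper's designs): with latency-insensitive FIFO interfaces, only the sequence of tokens matters, so the same stream-level test validates both a one-cycle module and a refined multi-cycle version of it.

```python
# Stream-level testing of a module and its latency-changing refinement.
from collections import deque

def run(module, inputs, max_cycles=100):
    """Drive a cycle-level module through FIFO interfaces and collect its
    output token stream; only token order matters, not timing."""
    in_q, out = deque(inputs), []
    for _ in range(max_cycles):
        module.clock(in_q, out)
    return out

class Mul1:
    """Original module: one result per model cycle."""
    def clock(self, in_q, out):
        if in_q:
            a, b = in_q.popleft()
            out.append(a * b)

class Mul3:
    """Refinement: a 3-stage pipeline with the same stream behavior."""
    def __init__(self):
        self.stages = deque([None, None, None])
    def clock(self, in_q, out):
        done = self.stages.popleft()
        if done is not None:
            out.append(done)
        if in_q:
            a, b = in_q.popleft()
            self.stages.append(a * b)
        else:
            self.stages.append(None)

tests = [(2, 3), (4, 5), (6, 7)]
assert run(Mul1(), list(tests)) == run(Mul3(), list(tests)) == [6, 20, 42]
print("refined module passes the same stream-level test")
```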

    Constructing a Weak Memory Model

    No full text
    Weak memory models are a consequence of the desire on the part of architects to preserve all the uniprocessor optimizations while building a shared-memory multiprocessor. The efforts to formalize the weak memory models of ARM and POWER over the last decades are mostly empirical - they try to capture empirically observed behaviors - and end up providing no insight into the inherent nature of weak memory models. This paper takes a constructive approach to find a common base for weak memory models: we explore what a weak memory model would look like if we constructed it with the explicit goal of preserving all the uniprocessor optimizations. We disallow some optimizations which break a programmer's intuition in highly unexpected ways. The constructed model, which we call the General Atomic Memory Model (GAM), allows all four load/store reorderings. We give the construction procedure of GAM, and provide insights which are used to define its operational and axiomatic semantics. Though no attempt is made to match GAM to any existing weak memory model, we show by simulation that GAM has performance comparable to other models. No deep knowledge of memory models is needed to read this paper.
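
    To see what one such reordering does (an illustrative sketch, not GAM's semantics), consider the classic store-buffering litmus test: under sequential consistency the outcome r1 = r2 = 0 is forbidden, but allowing Store-to-Load reordering, one of the four reorderings GAM permits, makes it observable.

```python
# Enumerating litmus-test outcomes with and without St->Ld reordering.
from itertools import permutations

def execute(schedule, t1, t2):
    """Run one interleaving; ops are ('st', addr, val) or ('ld', addr, reg)."""
    mem, regs, idx = {"x": 0, "y": 0}, {}, {"A": 0, "B": 0}
    prog = {"A": t1, "B": t2}
    for who in schedule:
        kind, addr, v = prog[who][idx[who]]
        idx[who] += 1
        if kind == "st":
            mem[addr] = v
        else:
            regs[v] = mem[addr]
    return regs["r1"], regs["r2"]

def outcomes(t1, t2):
    # All interleavings preserving each thread's (possibly reordered) order.
    return {execute(s, t1, t2) for s in set(permutations("AABB"))}

T1 = [("st", "x", 1), ("ld", "y", "r1")]   # thread A: x = 1; r1 = y
T2 = [("st", "y", 1), ("ld", "x", "r2")]   # thread B: y = 1; r2 = x

print("SC outcomes:        ", sorted(outcomes(T1, T2)))          # no (0, 0)
# St->Ld reordering lets each thread's load issue before its store:
print("with St->Ld reorder:", sorted(outcomes(T1[::-1], T2[::-1])))  # (0, 0) appears
```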

    Airblue: A System for Cross-Layer Wireless Protocol Development

    No full text
    Over the past few years, researchers have developed many cross-layer wireless protocols to improve the performance of wireless networks. Experimental evaluations of these protocols have been carried out mostly using software-defined radios, which are typically two to three orders of magnitude slower than commodity hardware. FPGA-based platforms provide much better speeds but are quite difficult to modify because of the way high-speed designs are typically implemented. Experimenting with cross-layer protocols requires a flexible way to convey information beyond the data itself from lower to higher layers, and a way for higher layers to configure lower layers dynamically and within some latency bounds. One also needs to be able to modify a layer's processing pipeline without triggering a cascade of changes. We have developed Airblue, an FPGA-based software radio platform that has all these properties and runs at speeds comparable to commodity hardware. We discuss the design philosophy underlying Airblue that makes it relatively easy to modify, and present early experimental results.
    National Science Foundation (U.S.) (NSF grants CNS-0721702, CCF-0541164, and CCF-0811696)
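
    A loose sketch of the cross-layer information flow described above (hypothetical types and fields, not Airblue's Bluespec interfaces): the PHY attaches side-channel metadata, such as SoftPHY-style per-bit confidence, to each frame, and a higher layer consumes it without the PHY pipeline changing shape.

```python
# Threading PHY hints up to the MAC alongside the data itself.
from dataclasses import dataclass

@dataclass
class PhyFrame:
    payload: bytes
    confidence: list       # per-bit (or per-symbol) decoder confidence hints
    rssi_dbm: float = -70.0

def mac_accept(frame: PhyFrame, min_conf: float = 0.9) -> bool:
    """A MAC-layer policy that uses PHY hints: accept a frame whose average
    decoder confidence clears a threshold, rather than relying on a binary
    CRC pass/fail alone."""
    avg = sum(frame.confidence) / len(frame.confidence)
    return avg >= min_conf

frame = PhyFrame(payload=b"\x01\x02", confidence=[0.99, 0.97, 0.4, 0.98])
print(mac_accept(frame))   # False: one low-confidence region drags it down
```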

    Enabling genomics pipelines in commodity personal computers with flash storage

    Get PDF
    Analysis of a patient’s genomics data is the first step toward precision medicine. Such analyses are performed on expensive enterprise-class server machines because input data sets are large, and the intermediate data structures are even larger (TB-size) and require random accesses. We present a general method to perform a specific genomics problem, mutation detection, on a cheap commodity personal computer (PC) with a small amount of DRAM. We construct and access large histograms of k-mers efficiently on external storage (SSDs) and apply our technique to a state-of-the-art reference-free genomics algorithm, SMUFIN, to create SMUFIN-F. We show that on two PCs, SMUFIN-F can achieve the same throughput at only one third (36%) the hardware cost and half (45%) the energy compared to SMUFIN on an enterprise-class server. To the best of our knowledge, SMUFIN-F is the first reference-free system that can detect somatic mutations on commodity PCs for whole human genomes. We believe our technique should apply to other k-mer or n-gram-based algorithms.
    This work was supported by the European Research Council (ERC) under the European Union’s Horizon 2020 research and innovation program (grant agreement No. 639595); the Ministry of Economy of Spain under contract TIN2015-65316-P and the Generalitat de Catalunya under contract 2014SGR1051; the ICREA Academia program; the BSC-CNS Severo Ochoa program (SEV-2015-0493); and MIT via NSF grant CCF-1725303. Travel support was provided by the MIT-Spain la Caixa Foundation Seed Fund.
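
    A hedged sketch of the general idea, not SMUFIN-F's code: build a large k-mer histogram with a small DRAM budget by hash-partitioning k-mers into on-disk buckets (the SSD's role), then counting bucket by bucket, so only one partition's counts need to be in memory at a time.

```python
# External-memory k-mer counting via hash partitioning to disk.
from collections import Counter
import os, tempfile

def count_kmers_external(reads, k=4, n_parts=8):
    """Yield (kmer, count) pairs; DRAM holds one partition at a time.
    A real system would use a stable hash rather than Python's hash()."""
    tmp = tempfile.mkdtemp()
    paths = [os.path.join(tmp, f"part{i}") for i in range(n_parts)]
    parts = [open(p, "w") for p in paths]
    for read in reads:                       # pass 1: stream k-mers to buckets
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            parts[hash(kmer) % n_parts].write(kmer + "\n")
    for f in parts:
        f.close()
    for p in paths:                          # pass 2: count bucket by bucket
        hist = Counter()
        with open(p) as f:
            hist.update(line.strip() for line in f)
        yield from hist.items()              # a real pipeline consumes and
                                             # discards each bucket here

reads = ["ACGTACGTAC", "CGTACGTACG"]
print(sorted(count_kmers_external(reads, k=4)))
```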

    The LongitudinAl Nationwide stuDy on Management And Real‐world outComes of diabetes in India over 3 years (LANDMARC trial)

    No full text
    Abstract Introduction LANDMARC (CTRI/2017/05/008452), a prospective, observational real‐world study, evaluated the occurrence of diabetes complications, glycemic control and treatment patterns in people with type 2 diabetes mellitus (T2DM) from pan‐India regions over a period of 3 years. Methods Participants with T2DM (≥25 to ≤60 years old at diagnosis, diabetes duration ≥2 years at the time of enrollment, with/without glycemic control and on ≥2 antidiabetic therapies) were included. The proportion of participants with macrovascular and microvascular complications, glycemic control and time to treatment adaptation over 36 months were assessed. Results Of the 6234 participants enrolled, 5273 completed 3 years of follow‐up. At the end of 3 years, 205 (3.3%) and 1121 (18.0%) participants reported macrovascular and microvascular complications, respectively. Nonfatal myocardial infarction (40.0%) and neuropathy (82.0%) were the most common complications. At baseline and 3 years, 25.1% (1119/4466) and 36.6% (1356/3700) of participants had HbA1c <7%, respectively. At 3 years, the populations with macrovascular and microvascular complications had a higher proportion of participants with uncontrolled glycemia (78.2% [79/101] and 70.3% [463/659], respectively) than those without complications (61.6% [1839/2985]). Over 3 years, the majority (67.7%–73.9%) of participants were taking only OADs (biguanides [92.2%], sulfonylureas [77.2%] and DPP‐IV inhibitors [62.4%]). Addition of insulin was preferred in participants who were only on OADs at baseline, and insulin use gradually increased from 25.5% to 36.7% by the end of 3 years. Conclusion These 3‐year trends highlight the burden of uncontrolled glycemia and cumulative diabetes‐related complications, emphasizing the importance of optimizing diabetes management in India.